home *** CD-ROM | disk | FTP | other *** search
- Subject: Re: tosfs, faster hash function...
- Date: Wed, 23 Feb 94 23:16:44 CET
- From: Juergen Lock <nox@jelal.north.de>
- In-Reply-To: <9402221314.AA11542@math.uni-muenster.de>; from "Julian Reschke" at Feb 22, 94 2:14 pm
- Message-Id: <9402232216.AA00510@jelal.north.de>
-
- Julian Reschke writes:
-
- > Re inodes: everybofy with large TOS filesystems should verify that
- > the (computed) inodes indeed are unique (as I said: it's highly
- > probale, but it can't be guaranteed). If you have gnu-find or my
- > find.ttp (mupftl02.tos, ftp.uni-muenster.de, pub/atari/Gemini),
- > you can try the following script:
- >
- > find \ -printf "%i\n" > inodes.all
- > sort inodes.all > inodes.tot
- > uniq inodes.tot > inodes.uni
- > wc -l inodes.tot inodes.uni
-
- or diff -u ...
-
- > rm inodes.tot inodes.uni inodes.all
- >
- ..but thats not why i followup :)
-
- >+
- >+ #define UPDC32(octet, crc) (((crc) << 8) ^ crctab[((crc) >> 24) ^ (octet)])
-
- hmm shifts are slow on 68000... here is one that should be a bit faster,
- from dbz (cnews):
-
- /*
- * This is a simplified version of the pathalias hashing function.
- * Thanks to Steve Belovin and Peter Honeyman
- *
- * hash a string into a long int. 31 bit crc (from andrew appel).
- * the crc table is computed at run time by crcinit() -- we could
- * precompute, but it takes 1 clock tick on a 750.
- *
- * This fast table calculation works only if POLY is a prime polynomial
- * in the field of integers modulo 2. Since the coefficients of a
- * 32-bit polynomial won't fit in a 32-bit word, the high-order bit is
- * implicit. IT MUST ALSO BE THE CASE that the coefficients of orders
- * 31 down to 25 are zero. Happily, we have candidates, from
- * E. J. Watson, "Primitive Polynomials (Mod 2)", Math. Comp. 16 (1962):
- * x^32 + x^7 + x^5 + x^3 + x^2 + x^1 + x^0
- * x^31 + x^3 + x^0
- *
- * We reverse the bits to get:
- * 111101010000000000000000000000001 but drop the last 1
- * f 5 0 0 0 0 0 0
- * 010010000000000000000000000000001 ditto, for 31-bit crc
- * 4 8 0 0 0 0 0 0
- */
-
- #define POLY 0x48000000L /* 31-bit polynomial (avoids sign problems) */
-
- static long CrcTable[128];
-
- /*
- - crcinit - initialize tables for hash function
- */
- static void
- crcinit()
- {
- register int i, j;
- register long sum;
-
- for (i = 0; i < 128; ++i) {
- sum = 0L;
- for (j = 7 - 1; j >= 0; --j)
- if (i & (1 << j))
- sum ^= POLY >> j;
- CrcTable[i] = sum;
- }
- DEBUG(("crcinit: done\n"));
- }
-
- /*
- - hash - Honeyman's nice hashing function
- */
- static long
- hash(name, size)
- register char *name;
- register int size;
- {
- register long sum = 0L;
-
- while (size--) {
- sum = (sum >> 7) ^ CrcTable[(sum ^ (*name++)) & 0x7f];
- }
- DEBUG(("hash: returns (%ld)\n", sum));
- return(sum);
- }
-
- i have not done tests which one is better for filenames but there should
- be no difference.
-
- >+
- >+ static unsigned long
- >+ filename_crc (const char *filename)
- >+ {
- >+ unsigned long s = 0;
- >+ unsigned int n = 0;
- >+
- >+ /* skip x: */
- >+ filename += 2;
- >+
- >+ while (*filename) {
- >+ s = UPDC32 (*filename++, s);
- >+ n++;
- >+ }
- >+
- >+ while (n != 0) {
- >+ s = UPDC32 (n & 0377, s);
- >+ n >>= 8;
- >+ }
- >+
- >+ return s;
- >+ }
- >+
- >+ #endif
-
- does the second loop improve results?
-
- just a thought...
- Juergen
-
- PS: there is a problem too with using start clusters: 0 byte files...
- --
- J"urgen Lock / nox@jelal.north.de / UUCP: ..!uunet!unido!uniol!jelal!nox
- ...ohne Gewehr
- PGP public key fingerprint = 8A 18 58 54 03 7B FC 12 1F 8B 63 C7 19 27 CF DA
-